Claims

What is claimed is: 

1. A method for managing a fault comprising:

detecting an error; 

gathering data associated with the error to generate an error event; and 
categorizing the error event using a hierarchical organization of the error event. 

2. The method of claim 1, further comprising: 

diagnosing the error using the error event to generate the fault; 
generating a fault event using the fault; and 

categorizing the fault event using a hierarchical organization of the fault event. 

3. The method of claim 2, further comprising: 

organizing the fault event using an error numeric association component, wherein 
the error numeric association component uniquely identifies the error event. 

4. The method of claim 2, wherein a class component of the fault event defines a name 
of the fault event in accordance with the hierarchical organization. 

5. The method of claim 2, further comprising: 

forwarding the fault event to a fault management architecture agent. 

6. The method of claim 1, further comprising: 

organizing the error event using an error numeric association component, wherein 
the error numeric association component uniquely identifies the error event. 

7. The method of claim 1, wherein a class component of the error event defines a name 
of the error event in accordance with the hierarchical organization. 

8. The method of claim 1, wherein gathering data associated with the error comprises
gathering data to populate the following components within the error event:

a version component defining a version of a protocol used to define the error
event;

a class component defining a name of the error event using the hierarchical
organization of the error event;

an error numeric association component uniquely identifying the error event;

a detector component identifying a resource that detected the error; and

a recoverable component indicating whether the error handler designated the error
as recoverable.

9. The method of claim 8, wherein gathering data associated with the error further 
comprises: 

gathering data to populate a disposition component within the error event, wherein 
the disposition component indicates a result of an error handler attempt to 
correct the error. 

10. The method of claim 9, wherein the disposition component comprises at least one 
selected from the group consisting of uncorrected, self-corrected, uncorrectable, and 
soft-corrected. 

11. The method of claim 8, wherein the class component is defined using a string 
representation of a hierarchical organization of the error event. 
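
By way of non-limiting illustration, the error event components recited in claims 8 through 11 might be sketched as follows. The field names, the dot-separated form of the class string, and the example class name `ereport.cpu.cache.l2` are assumptions made for this sketch and are not taken from the claims.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Disposition(Enum):
    """Result of the error handler's attempt to correct the error."""
    UNCORRECTED = "uncorrected"
    SELF_CORRECTED = "self-corrected"
    UNCORRECTABLE = "uncorrectable"
    SOFT_CORRECTED = "soft-corrected"


@dataclass(frozen=True)
class ErrorEvent:
    version: int          # version of the protocol used to define the event
    event_class: str      # hierarchical name (assumed here to be dot-separated)
    ena: int              # error numeric association identifying the event
    detector: str         # resource that detected the error
    recoverable: bool     # whether the error handler designated it recoverable
    disposition: Optional[Disposition] = None

    def categories(self) -> List[str]:
        """Return each level of the hierarchy, most general first."""
        parts = self.event_class.split(".")
        return [".".join(parts[: i + 1]) for i in range(len(parts))]
```

Under these assumptions, an event of class `ereport.cpu.cache.l2` falls into each of the categories `ereport`, `ereport.cpu`, `ereport.cpu.cache`, and `ereport.cpu.cache.l2`, so a consumer may match on any prefix of the hierarchy.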

12. The method of claim 8, wherein the error numeric association component is defined 
using at least one format selected from the group consisting of Format 0, Format 1, 
Format 2, and Format 3. 

13. The method of claim 12, wherein Format 0 comprises a pointer to an error
propagation tree. 

14. The method of claim 12, wherein Format 1 comprises a time element, a central 
processing unit identification element, and a generation element. 

15. The method of claim 12, wherein Format 2 comprises a time element, a sequence 
number element, and a generation element. 

16. The method of claim 12, wherein Format 3 comprises code indicating an extended 
format field. 
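
As a hedged sketch only, the elements of Format 1 and Format 2 recited in claims 14 and 15 could be packed into a single integer. The 64-bit total, the individual field widths, the bit positions, and the use of the two low-order bits as a format code are all assumptions of this example; the claims specify only which elements each format comprises.

```python
from enum import IntEnum
from typing import Tuple


class EnaFormat(IntEnum):
    FORMAT0 = 0  # pointer to an error propagation tree
    FORMAT1 = 1  # time, CPU identification, generation
    FORMAT2 = 2  # time, sequence number, generation
    FORMAT3 = 3  # code indicating an extended format field


# Hypothetical field widths (2 + 44 + 10 + 8 = 64 bits).
FORMAT_BITS, TIME_BITS, ID_BITS, GEN_BITS = 2, 44, 10, 8


def _pack(fmt: EnaFormat, time: int, ident: int, gen: int) -> int:
    """Pack the fields, lowest bits first: format, time, id, generation."""
    for name, value, bits in (("time", time, TIME_BITS),
                              ("id", ident, ID_BITS),
                              ("generation", gen, GEN_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    ena = int(fmt)
    ena |= time << FORMAT_BITS
    ena |= ident << (FORMAT_BITS + TIME_BITS)
    ena |= gen << (FORMAT_BITS + TIME_BITS + ID_BITS)
    return ena


def make_format1(time: int, cpu_id: int, gen: int) -> int:
    return _pack(EnaFormat.FORMAT1, time, cpu_id, gen)


def make_format2(time: int, sequence: int, gen: int) -> int:
    return _pack(EnaFormat.FORMAT2, time, sequence, gen)


def unpack(ena: int) -> Tuple[EnaFormat, int, int, int]:
    """Recover (format, time, id-or-sequence, generation) from a packed value."""
    fmt = EnaFormat(ena & ((1 << FORMAT_BITS) - 1))
    time = (ena >> FORMAT_BITS) & ((1 << TIME_BITS) - 1)
    ident = (ena >> (FORMAT_BITS + TIME_BITS)) & ((1 << ID_BITS) - 1)
    gen = (ena >> (FORMAT_BITS + TIME_BITS + ID_BITS)) & ((1 << GEN_BITS) - 1)
    return fmt, time, ident, gen
```

Carrying the format code inside the value lets a consumer determine how to interpret the remaining bits without any side information.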

17. The method of claim 8, wherein the detector component is defined using a fault 
managed resource identifier. 

18. The method of claim 17, wherein the fault managed resource identifier comprises an 
authority element. 

19. The method of claim 17, wherein the fault managed resource identifier is defined using a
scheme.

20. The method of claim 19, wherein the scheme comprises at least one selected from the 
group consisting of a hardware component scheme, a diagnosis-engine scheme, a 
device scheme, and a service scheme. 
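
For illustration, a fault managed resource identifier having a scheme and an optional authority element, as recited in claims 17 through 20, might be modeled as shown below. The short scheme names (`hc`, `de`, `dev`, `svc`) and the `scheme://authority/path` string form are assumptions of this sketch; the claims name the schemes but do not prescribe a textual syntax.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Scheme(Enum):
    HARDWARE_COMPONENT = "hc"   # assumed abbreviation
    DIAGNOSIS_ENGINE = "de"     # assumed abbreviation
    DEVICE = "dev"              # assumed abbreviation
    SERVICE = "svc"             # assumed abbreviation


@dataclass(frozen=True)
class Fmri:
    scheme: Scheme
    path: str
    authority: Optional[str] = None   # e.g. the host that names the resource

    def __str__(self) -> str:
        auth = self.authority or ""
        return f"{self.scheme.value}://{auth}/{self.path.lstrip('/')}"


def parse_fmri(text: str) -> Fmri:
    """Parse the assumed 'scheme://authority/path' form back into an Fmri."""
    scheme_text, sep, rest = text.partition("://")
    if not sep:
        raise ValueError(f"not an FMRI: {text!r}")
    authority, _, path = rest.partition("/")
    return Fmri(Scheme(scheme_text), path, authority or None)
```

A detector component could then hold, for example, `Fmri(Scheme.HARDWARE_COMPONENT, "motherboard=0/cpu=1", "host-a")`, where the path and authority values are hypothetical.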

21. The method of claim 2, wherein generating the fault event comprises gathering data 
to populate the following components in the fault event: 

a version component defining a version of a protocol used to define the fault 
event; 

a class component defining a name of the fault event using the hierarchical 
organization of the fault event; 

a diagnosis engine identifier component identifying a diagnosis engine used to
obtain the fault;

an error numeric association list component including at least one error numeric
association;

an automatic system reconfiguration unit component defining a unit that may be
reconfigured in response to the fault;

a resource component defining a finest-grain resource identified by the diagnosis
engine;

a field replaceable unit component defining a unit that must be repaired to clear 
the fault; and 

a certainty component identifying a level of certainty attributed to the fault 
diagnosed by the diagnosis engine. 

22. The method of claim 21, wherein generating the fault event further comprises:

gathering data to populate a fault diagnosis time component, wherein the fault
diagnosis time component indicates a time the diagnosis of the fault was
performed.
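
The fault event components recited in claims 21 and 22 might, as one possible sketch, be collected in a single record. Representing the resource identifiers as plain strings, expressing certainty as a percentage from 0 to 100, and storing the diagnosis time as seconds since the epoch are assumptions of this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FaultEvent:
    version: int                 # version of the protocol defining the event
    event_class: str             # hierarchical name of the fault event
    diagnosis_engine: str        # engine used to obtain the fault
    ena_list: Tuple[int, ...]    # at least one error numeric association
    asru: str                    # unit that may be reconfigured
    resource: str                # finest-grain resource identified
    fru: str                     # unit that must be repaired
    certainty: int               # assumed to be a percentage, 0 to 100
    diagnosis_time: Optional[float] = None  # assumed seconds since epoch

    def __post_init__(self) -> None:
        if not self.ena_list:
            raise ValueError("ena_list needs at least one error numeric association")
        if not 0 <= self.certainty <= 100:
            raise ValueError("certainty must be between 0 and 100")
```

Validating the list at construction time reflects the requirement that the error numeric association list include at least one entry.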

23. The method of claim 21, wherein the class component is defined using a string 
representation of a hierarchical organization of the error event. 

24. The method of claim 21, wherein the error numeric association component is defined 
using at least one format selected from the group consisting of Format 0, Format 1, 
Format 2, and Format 3. 

25. The method of claim 24, wherein Format 0 comprises a pointer to an error 
propagation tree. 

26. The method of claim 24, wherein Format 1 comprises a time element, a central 
processing unit identification element, and a generation element. 

27. The method of claim 24, wherein Format 2 comprises a time element, a sequence
number element, and a generation element. 

28. The method of claim 24, wherein Format 3 comprises code indicating an extended 
format field. 

29. The method of claim 21, wherein at least one selected from the group consisting of
the diagnosis engine identifier component, the automatic system reconfiguration unit
component, the field replaceable unit component, and the resource component is
defined using a fault managed resource identifier.

30. The method of claim 29, wherein the fault managed resource identifier comprises an 
authority element. 

31. The method of claim 29, wherein the fault managed resource identifier is defined 
using a scheme. 

32. The method of claim 31, wherein the scheme comprises at least one selected from the 
group consisting of a hardware component scheme, a diagnosis-engine scheme, a 
device scheme, and a service scheme. 

33. The method of claim 2, wherein generating a fault event comprises associating the 
fault event with a suspect list. 

34. The method of claim 33, wherein the suspect list comprises: 

a universal unique identifier component identifying the suspect list; 

a diagnosis engine identifier component identifying a diagnosis engine used to
diagnose the error event that subsequently generated the fault event; and

a fault-events component listing the fault event.
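
A minimal sketch of the suspect list of claim 34 is given below, assuming a randomly generated (version 4) universal unique identifier and assuming that fault events are listed by class name. The names `SuspectList`, `list_id`, and `add` are hypothetical.

```python
import uuid
from dataclasses import dataclass, field
from typing import List


@dataclass
class SuspectList:
    diagnosis_engine: str   # engine that diagnosed the error event
    fault_events: List[str] = field(default_factory=list)
    # Universal unique identifier for this suspect list (assumed UUID v4).
    list_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def add(self, fault_event_class: str) -> None:
        """Associate a fault event with this suspect list."""
        self.fault_events.append(fault_event_class)
```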






PATENT APPLICATION 
ATTORNEY DOCKET NO. 03226.335001 ; SUN040224 

35. A system for managing a fault comprising: 

an error handler detecting an error and generating an error event using the error, 
wherein the error is defined using a hierarchical organization of the error 
event; 

a fault manager diagnosing the error event to obtain the fault and generating a fault 
event using the fault, wherein the fault event is defined using a hierarchical 
organization of the fault event; and 

a fault management architecture agent receiving the fault event and initiating an 
action in accordance with the fault event. 

36. The system of claim 35, wherein the fault manager organizes the error event using 
an error numeric association component of the error event. 

37. The system of claim 35, wherein the fault management architecture agent organizes 
the fault event using an error numeric association component of the fault event. 

38. The system of claim 35, wherein the error event comprises: 

a version component defining a version of a protocol used to define the error 
event; 

a class component defining a name of the error event using the hierarchical
organization of the error event;

an error numeric association component uniquely identifying the error event;

a detector component identifying a resource that detected the error; and

a recoverable component indicating whether the error handler designated the error
as recoverable.

39. The system of claim 38, wherein the error event further comprises: 

a disposition component indicating a result of an error handler attempt to correct 
the error. 

40. The system of claim 39, wherein the disposition component comprises at least one
selected from the group consisting of uncorrected, self-corrected, uncorrectable, and
soft-corrected.

41. The system of claim 38, wherein the class component is defined using a string 
representation of a hierarchical organization of the error event. 

42. The system of claim 38, wherein the error numeric association component is defined 
using at least one format selected from the group consisting of Format 0, Format 1, 
Format 2, and Format 3. 

43. The system of claim 42, wherein Format 0 comprises a pointer to an error 
propagation tree. 

44. The system of claim 42, wherein Format 1 comprises a time element, a central 
processing unit identification element, and a generation element. 

45. The system of claim 42, wherein Format 2 comprises a time element, a sequence 
number element, and a generation element. 

46. The system of claim 42, wherein Format 3 comprises code indicating an extended 
format field. 

47. The system of claim 38, wherein the detector component is defined using a fault 
managed resource identifier. 

48. The system of claim 47, wherein the fault managed resource identifier comprises an 
authority element. 

49. The system of claim 47, wherein the fault managed resource identifier is defined 
using a scheme. 

50. The system of claim 49, wherein the scheme comprises at least one selected from the
group consisting of a hardware component scheme, a diagnosis-engine scheme, a 
device scheme, and a service scheme. 

51. The system of claim 38, wherein the fault event comprises:

a version component defining a version of a protocol used to define the fault
event;

a class component defining a name of the fault event using the hierarchical
organization of the fault event;

a diagnosis engine identifier component identifying a diagnosis engine used to
obtain the fault;

an error numeric association list component including at least one error numeric
association;

an automatic system reconfiguration unit component defining a unit that may be
reconfigured in response to the fault;

a resource component defining a finest-grain resource identified by the diagnosis
engine;

a field replaceable unit component defining a unit that must be repaired to clear 
the fault; and 

a certainty component identifying a level of certainty attributed to the fault 
diagnosed by the diagnosis engine. 

52. The system of claim 51, wherein the fault event further comprises: 

a fault diagnosis time component indicating a time the diagnosis of the fault is 
performed. 

53. The system of claim 51, wherein the class component is defined using a string 
representation of a hierarchical organization of the error event. 

54. The system of claim 51, wherein the error numeric association component is defined
using at least one format selected from the group consisting of Format 0, Format 1, 
Format 2, and Format 3. 

55. The system of claim 54, wherein Format 0 comprises a pointer to an error 
propagation tree. 

56. The system of claim 54, wherein Format 1 comprises a time element, a central 
processing unit identification element, and a generation element. 

57. The system of claim 54, wherein Format 2 comprises a time element, a sequence 
number element, and a generation element. 

58. The system of claim 54, wherein Format 3 comprises code indicating an extended 
format field. 

59. The system of claim 51, wherein at least one selected from the group consisting of
the diagnosis engine identifier component, the automatic system reconfiguration unit
component, the field replaceable unit component, and the resource component is
defined using a fault managed resource identifier.

60. The system of claim 59, wherein the fault managed resource identifier comprises an 
authority element. 

61. The system of claim 59, wherein the fault managed resource identifier is defined 
using a scheme. 

62. The system of claim 61, wherein the scheme comprises at least one selected from the 
group consisting of a hardware component scheme, a diagnosis-engine scheme, a 
device scheme, and a service scheme. 

63. The system of claim 35, wherein the fault event is included in a suspect list. 

64. The system of claim 63, wherein the suspect list comprises:

a universal unique identifier component identifying the suspect list;

a diagnosis engine identifier component identifying a diagnosis engine used to
diagnose the error event that subsequently generated the fault event; and

a fault-events component listing the fault event.

65. The system of claim 35, wherein diagnosing the error event comprises forwarding
the error event to a diagnosis engine.
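
The flow among the error handler, the fault manager, a diagnosis engine, and the fault management architecture agent of claims 35 and 65 can be sketched as below. This is an illustrative outline only: the dictionary-based events, the `ereport.` and `fault.` class prefixes, and the "offline" action are assumptions and do not appear in the claims.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]


def error_handler(error: str, detector: str, ena: int) -> Event:
    """Detect an error and generate an error event describing it."""
    return {"class": f"ereport.{error}", "ena": ena, "detector": detector}


def diagnosis_engine(error_event: Event) -> Event:
    """Diagnose the error event and produce a fault event (toy rule)."""
    suffix = str(error_event["class"]).split(".", 1)[1]
    return {
        "class": f"fault.{suffix}",
        "ena_list": [error_event["ena"]],
        "resource": error_event["detector"],
    }


def fault_manager(error_event: Event,
                  engine: Callable[[Event], Event]) -> Event:
    """Forward the error event to a diagnosis engine and return its fault event."""
    return engine(error_event)


def agent(fault_event: Event, actions: List[str]) -> None:
    """Receive the fault event and initiate an action (here, record it)."""
    actions.append(f"offline {fault_event['resource']}")
```

Because each stage exchanges only event records, the stages could run in separate processes or on separate nodes without changing their interfaces.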

66. A network system having a plurality of nodes, comprising: 

an error handler detecting an error and generating an error event using the error, 
wherein the error is defined using a hierarchical organization of the error 
event; 

a fault manager diagnosing the error event to obtain the fault and generating a fault
event using the fault, wherein the fault event is defined using a hierarchical
organization of the fault event; and

a fault management architecture agent receiving the fault event and initiating an
action in accordance with the fault event,

wherein the error handler executes on any node of the plurality of nodes,

wherein the fault manager executes on any node of the plurality of nodes, and

wherein the fault management architecture agent executes on any node of the
plurality of nodes.

67. The network system of claim 66, wherein the error is detected on a first node of the 
plurality of nodes and the error event is generated on a second node of the plurality of 
nodes. 

68. The network system of claim 66, wherein the error event is received by the fault 
manager on a first node of the plurality of nodes and the error event is diagnosed by a 
diagnosis engine on a second node of the plurality of nodes. 

69. The network system of claim 66, wherein the fault manager organizes the error event
using an error numeric association component of the error event. 

70. The network system of claim 66, wherein the fault management architecture agent 
organizes the fault event using an error numeric association component of the fault 
event. 